12 research outputs found

Action recognition in depth videos using nonparametric probabilistic graphical models

Author: Raman Natraj

Action recognition involves automatically labelling videos that contain human motion with action classes. It has applications in diverse areas such as smart surveillance, human computer interaction and content retrieval. The recent advent of depth sensing technology that produces depth image sequences has offered opportunities to solve the challenging action recognition problem. The depth images facilitate robust estimation of a human skeleton’s 3D joint positions and a high level action can be inferred from a sequence of these joint positions. A natural way to model a sequence of joint positions is to use a graphical model that describes probabilistic dependencies between the observed joint positions and some hidden state variables. A problem with these models is that the number of hidden states must be fixed a priori even though for many applications this number is not known in advance. This thesis proposes nonparametric variants of graphical models with the number of hidden states automatically inferred from data. The inference is performed in a full Bayesian setting by using the Dirichlet Process as a prior over the model’s infinite dimensional parameter space. This thesis describes three original constructions of nonparametric graphical models that are applied in the classification of actions in depth videos. Firstly, the action classes are represented by a Hidden Markov Model (HMM) with an unbounded number of hidden states. The formulation enables information sharing and discriminative learning of parameters. Secondly, a hierarchical HMM with an unbounded number of actions and poses is used to represent activities. The construction produces a simplified model for activity classification by using logistic regression to capture the relationship between action states and activity labels. Finally, the action classes are modelled by a Hidden Conditional Random Field (HCRF) with the number of intermediate hidden states learned from data. 
Tractable inference procedures based on Markov Chain Monte Carlo (MCMC) techniques are derived for all these constructions. Experiments with multiple benchmark datasets confirm the efficacy of the proposed approaches for action recognition.
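
As a rough illustration of why a Dirichlet Process prior leaves the number of hidden states open, the sketch below draws mixture weights with the standard stick-breaking construction. It is not the thesis's model: the function name `stick_breaking`, the concentration value and the truncation tolerance `tol` are assumptions made only for this example.

```python
import random

def stick_breaking(alpha, tol=1e-6, rng=None):
    """Draw Dirichlet Process weights by stick-breaking (illustrative sketch).

    alpha: concentration parameter; larger values spread mass over more states.
    tol:   stop once the unassigned stick length falls to this value or below.
    Returns (weights, remaining), where remaining is the leftover mass.
    """
    rng = rng or random.Random()
    weights = []
    remaining = 1.0                            # length of the stick not yet assigned
    while remaining > tol:
        frac = rng.betavariate(1.0, alpha)     # fraction broken off: Beta(1, alpha)
        weights.append(remaining * frac)       # weight of the next state
        remaining *= 1.0 - frac                # shrink the leftover stick
    return weights, remaining

if __name__ == "__main__":
    weights, rest = stick_breaking(alpha=2.0, rng=random.Random(0))
    # The number of states is not fixed in advance; it depends on the draw.
    print(len(weights), "states, leftover mass", rest)
```

The number of weights returned varies from draw to draw, which is the property the abstract relies on when it speaks of an unbounded number of hidden states.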

Birkbeck Institutional Research Online

Synthetic Text Generation using Hypergraph Representations

Author: Raman Natraj
Shah Sameena
Publication date: 06/09/2023

Generating synthetic variants of a document is often posed as text-to-text transformation. We propose an alternative LLM-based method that first decomposes a document into semantic frames and then generates text using this interim sparse format. The frames are modeled using a hypergraph, which allows perturbing the frame contents in a principled manner. Specifically, new hyperedges are mined through topological analysis and complex polyadic relationships including hierarchy and temporal dynamics are accommodated. We show that our solution generates documents that are diverse, coherent and vary in style, sentiment, format, composition and facts.

arXiv.org e-Print Archive

Non-parametric hidden conditional random fields for action classification

Author: Maybank Stephen J.
Raman Natraj
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 03/11/2016

Conditional Random Fields (CRF), a structured prediction method, combines probabilistic graphical models and discriminative classification techniques in order to predict class labels in sequence recognition problems. Its extension, the Hidden Conditional Random Field (HCRF), uses hidden state variables in order to capture intermediate structures. The number of hidden states in an HCRF must be specified a priori. This number is often not known in advance. A non-parametric extension to the HCRF, with the number of hidden states automatically inferred from data, is proposed here. This is a significant advantage over the classical HCRF since it avoids ad hoc model selection procedures. Further, the training and inference procedure is fully Bayesian, eliminating the overfitting problem associated with frequentist methods. In particular, our construction is based on scale mixtures of Gaussians as priors over the HCRF parameters and makes use of the Hierarchical Dirichlet Process (HDP) and the Laplace distribution. The proposed inference procedure uses elliptical slice sampling, a Markov Chain Monte Carlo (MCMC) method, in order to sample optimal and sparse posterior HCRF parameters. The above technique is applied for classifying human actions that occur in depth image sequences – a challenging computer vision problem. Experiments with real world video datasets confirm the efficacy of our classification approach.
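
For readers unfamiliar with elliptical slice sampling, the following is a minimal sketch of one generic update step for a target with a zero-mean Gaussian prior. It is not the paper's HCRF sampler: the function name `elliptical_slice_step`, the callbacks `log_lik` and `draw_prior`, and the one-dimensional toy problem are all assumptions introduced for illustration.

```python
import math
import random

def elliptical_slice_step(f, log_lik, draw_prior, rng):
    """One elliptical slice sampling update (generic sketch).

    f:          current state, a list of floats
    log_lik:    function returning the log-likelihood of a state
    draw_prior: function drawing a sample from the zero-mean Gaussian prior
    """
    nu = draw_prior(rng)                                 # auxiliary draw defining the ellipse
    log_y = log_lik(f) + math.log(1.0 - rng.random())    # slice level; 1 - U lies in (0, 1]
    theta = rng.uniform(0.0, 2.0 * math.pi)              # initial angle on the ellipse
    lo, hi = theta - 2.0 * math.pi, theta                # bracket that always contains 0
    while True:
        proposal = [a * math.cos(theta) + b * math.sin(theta)
                    for a, b in zip(f, nu)]
        if log_lik(proposal) > log_y:
            return proposal                              # accepted point on the slice
        # Rejected: shrink the bracket towards theta = 0 (the current state).
        if theta < 0.0:
            lo = theta
        else:
            hi = theta
        theta = rng.uniform(lo, hi)

if __name__ == "__main__":
    # Toy problem (assumed): prior N(0, 1), one observation y = 2 with unit noise.
    rng = random.Random(0)
    log_lik = lambda f: -0.5 * (2.0 - f[0]) ** 2
    draw_prior = lambda r: [r.gauss(0.0, 1.0)]
    state = [0.0]
    for _ in range(1000):
        state = elliptical_slice_step(state, log_lik, draw_prior, rng)
    print(state)
```

In this toy setting the exact posterior is Gaussian with mean 1 and variance 0.5, so a long run of the sampler can be checked against those values. The step has no tuning parameters, which is one reason the method is attractive for complex posteriors.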

Crossref

Birkbeck Institutional Research Online

Action classification using a discriminative multilevel HDP-HMM

Author: Maybank Stephen J.
Raman Natraj
Publication venue: 'Elsevier BV'
Publication date: 01/04/2015

We classify human actions occurring in depth image sequences using features based on skeletal joint positions. The action classes are represented by a multi-level Hierarchical Dirichlet Process – Hidden Markov Model (HDP-HMM). The non-parametric HDP-HMM allows the inference of hidden states automatically from training data. The model parameters of each class are formulated as transformations from a shared base distribution, thus promoting the use of unlabelled examples during training and borrowing information across action classes. Further, the parameters are learnt in a discriminative way. We use a normalized gamma process representation of HDP and margin-based likelihood functions for this purpose. We sample parameters from the complex posterior distribution induced by our discriminative likelihood function using elliptical slice sampling. Experiments with two different datasets show that action class models learnt using our technique produce good classification results.

Crossref

Birkbeck Institutional Research Online

Bayesian Hierarchical Models for Counterfactual Estimation

Author: Magazzeni Daniele
Raman Natraj
Shah Sameena
Publication date: 20/01/2023

Counterfactual explanations utilize feature perturbations to analyze the outcome of an original decision and recommend an actionable recourse. We argue that it is beneficial to provide several alternative explanations rather than a single point solution and propose a probabilistic paradigm to estimate a diverse set of counterfactuals. Specifically, we treat the perturbations as random variables endowed with prior distribution functions. This allows sampling multiple counterfactuals from the posterior density, with the added benefit of incorporating inductive biases, preserving domain specific constraints and quantifying uncertainty in estimates. More importantly, we leverage Bayesian hierarchical modeling to share information across different subgroups of a population, which can both improve robustness and measure fairness. A gradient based sampler with superior convergence characteristics efficiently computes the posterior samples. Experiments across several datasets demonstrate that the counterfactuals estimated using our approach are valid, sparse, diverse and feasible.

arXiv.org e-Print Archive

Synthetic Document Generator for Annotation-free Layout Recognition

Author: Raman Natraj
Shah Sameena
Veloso Manuela
Publication venue: 'Elsevier BV'
Publication date: 24/07/2022

Analyzing the layout of a document to identify headers, sections, tables, figures etc. is critical to understanding its content. Deep learning based approaches for detecting the layout structure of document images have been promising. However, these methods require a large number of annotated examples during training, which are both expensive and time consuming to obtain. We describe here a synthetic document generator that automatically produces realistic documents with labels for spatial positions, extents and categories of the layout elements. The proposed generative process treats every physical component of a document as a random variable and models their intrinsic dependencies using a Bayesian Network graph. Our hierarchical formulation using stochastic templates allows parameter sharing between documents for retaining broad themes and yet the distributional characteristics produce visually unique samples, thereby capturing complex and diverse layouts. We empirically illustrate that a deep layout detection model trained purely on the synthetic documents can match the performance of a model that uses real documents.

arXiv.org e-Print Archive

Activity recognition using a supervised non-parametric hierarchical HMM

Author: Maybank Stephen J.
Raman Natraj
Publication venue: 'Elsevier BV'
Publication date: 01/07/2016

The problem of classifying human activities occurring in depth image sequences is addressed. The 3D joint positions of a human skeleton and the local depth image pattern around these joint positions define the features. A two level hierarchical Hidden Markov Model (H-HMM), with independent Markov chains for the joint positions and depth image pattern, is used to model the features. The states corresponding to the H-HMM bottom level characterize the granular poses while the top level characterizes the coarser actions associated with the activities. Further, the H-HMM is based on a Hierarchical Dirichlet Process (HDP), and is fully non-parametric with the number of pose and action states inferred automatically from data. This is a significant advantage over classical HMM and its extensions. In order to perform classification, the relationships between the actions and the activity labels are captured using multinomial logistic regression. The proposed inference procedure ensures alignment of actions from activities with similar labels. Our construction enables information sharing, allows incorporation of unlabelled examples and provides a flexible factorized representation to include multiple data channels. Experiments with multiple real world datasets show the efficacy of our classification approach.

Crossref

Birkbeck Institutional Research Online

DocLLM: A layout-aware generative language model for multimodal document understanding

Author: Babkin Petr
Kaur Simerjot
Liu Xiaomo
Ma Zhiqiang
Nourbakhsh Armineh
Pei Yulong
Raman Natraj
Sibue Mathieu
Wang Dongsheng
Publication date: 31/12/2023

Enterprise documents such as forms, invoices, receipts, reports, contracts, and other similar records, often carry rich semantics at the intersection of textual and spatial modalities. The visual cues offered by their complex layouts play a crucial role in comprehending these documents effectively. In this paper, we present DocLLM, a lightweight extension to traditional large language models (LLMs) for reasoning over visual documents, taking into account both textual semantics and spatial layout. Our model differs from existing multimodal LLMs by avoiding expensive image encoders and focuses exclusively on bounding box information to incorporate the spatial layout structure. Specifically, the cross-alignment between text and spatial modalities is captured by decomposing the attention mechanism in classical transformers to a set of disentangled matrices. Furthermore, we devise a pre-training objective that learns to infill text segments. This approach allows us to address irregular layouts and heterogeneous content frequently encountered in visual documents. The pre-trained model is fine-tuned using a large-scale instruction dataset, covering four core document intelligence tasks. We demonstrate that our solution outperforms SotA LLMs on 14 out of 16 datasets across all tasks, and generalizes well to 4 out of 5 previously unseen datasets.
Comment: 16 pages, 4 figures

arXiv.org e-Print Archive

Municipal Bond Pricing: A Data Driven Method

Author: Jochen L. Leidner
Natraj Raman
Publication venue: 'MDPI AG'
Publication date: 01/09/2018

Price evaluations of municipal bonds have traditionally been performed by human experts based on their market knowledge and trading experience. Automated evaluation is an attractive alternative providing the advantage of an objective estimation that is transparent, consistent, and scalable. In this paper, we present a statistical model to automatically estimate U.S. municipal bond yields based on trade transactions and study the agreement between human evaluations and machine generated estimates. The model uses piecewise polynomials constructed using basis functions. This provides immense flexibility in capturing the wide dispersion of yields. A novel transfer learning based approach that exploits the latent hierarchical relationship of the bonds is applied to enable robust yield estimation even in the absence of adequate trade data. The Bayesian nature of our model offers a principled framework to account for uncertainty in the estimates. Our inference procedure scales well even for large data sets. We demonstrate the empirical effectiveness of our model by assessing over 100,000 active bonds and find that our estimates are in line with hand priced evaluations for a large number of bonds.

Directory of Open Access Journals